The Cache Injection/Cofetch Architecture: Initial Performance Evaluation
Authors
Abstract
One of the major problems in a number of SM (Shared Memory) and DSM (Distributed Shared Memory) applications is the overall cost of read misses in conditions when: (a) system latencies are relatively large, and (b) a shared data item is read relatively few times by each of the processors in the system; modern SM and DSM systems are typically based on off-the-shelf microprocessors which do not include any support for the described problem. Consequently, the major goal of our research is to come up with a new concept to be incorporated into the next generation microprocessors, so they can become more efficient in the sense described above. Existing 64-bit processors support only data prefetching (PF) as a method to fight against negative effects of the described problem. Our research introduces a new concept referred to as cache injection (CI), as well as the related cache injection/cofetch architecture (CICA). Initial performance evaluation is performed using a simulation methodology based on the set of synthetic benchmarks of interest for the research sponsor.
Similar resources
Compiler-Assisted Cache Replacement: Problem Formulation and Performance Evaluation
Recent research results show that conventional hardware-only cache solutions result in unsatisfactory cache utilization for both regular and irregular applications. To overcome this problem, a number of architectures introduce instruction hints to assist cache replacement. For example, Intel Itanium architecture augments memory accessing instructions with cache hints to distinguish data that wi...
Computer Science Technical Report Evaluation of a Split Scalar/Array Cache Architecture
The widening gap between the processor clock speed and the memory latency puts an added pressure on the performance of cache memories. This problem is amplified by the increase in instruction issue per cycle. This paper reports on the initial evaluation of a split scalar and array data cache. This scheme allows an efficient exploitation of both temporal and spatial locality by having a different or...
Design and Performance Evaluation of a Multithreaded Architecture
Multithreaded architectures have the ability to tolerate long memory latencies and unpredictable synchronization delays. In this paper we propose a multithreaded architecture that is capable of exploiting both coarse-grain parallelism, and fine-grain instruction-level parallelism in a program. Instruction-level parallelism is exploited by grouping instructions from a number of active threads at...
The Split Spatial/Non-Spatial Cache: A Performance and Complexity Evaluation
A simple new method of detecting useful spatial locality is proposed in this paper. The new method is tested by incorporating it into a new split cache design. Complexity estimation and performance evaluation of the new split cache design is done in order to compare it to the conventional cache architecture and the split temporal/spatial cache design.